Auto-exploratory Average Reward Reinforcement Learning

Authors

  • DoKyeong Ok
  • Prasad Tadepalli
Abstract

We introduce a model-based average reward Reinforcement Learning method called H-learning and compare it with its discounted counterpart, Adaptive Real-Time Dynamic Programming, in a simulated robot scheduling task. We also introduce an extension to H-learning, which automatically explores the unexplored parts of the state space, while always choosing greedy actions with respect to the current value function. We show that this “Auto-exploratory H-learning” performs better than the original H-learning under previously studied exploration methods such as random, recency-based, or counter-based ex-

Similar resources

Technology in Spiritual Formation: An Exploratory Study of Computer Mediated Religious Communications

In this paper, we report findings from a study of American Christian ministers’ uses of technologies in religious practices. We focus on the use of technologies for spiritual purposes as opposed to pragmatic and logistical, but report on all. We present results about the uses of technologies in three aspects of religious work: religious study and reflection, chur...

Impulse Purchasing Behaviors of the Turkish Consumers in Websites as a Dynamic Consumer Model: Technology Products Example

This paper examines the basic concept of impulse purchasing behavior online. The phenomenon of impulse purchasing has been researched in consumer research as well as, for example, in psychology and economics since the 1950s. A detailed review and analysis of the literature asserts that there are some unsolved issues regarding the state of knowledge on impulse purchasing behavior. Fur...

Incentives to Settle Under Joint and Several Liability: An Empirical Analysis of Superfund Litigation

Congress may soon restrict joint and several liability for cleanup of contaminated sites under Superfund. We explore whether this change would discourage settlements and is therefore likely to increase the program's already high litigation costs per site. Recent theoretical research by Kornhauser and Revesz finds that joint and several liability may either encourage or discourag...

Inbreeding Effects on Average Daily Gains and Kleiber Ratios in Iranian Moghani Sheep

The objective of the present study was to evaluate the effects of inbreeding on average daily gains and Kleiber ratios in Moghani sheep. Traits included average daily gain from birth to 3 months (ADG1), average daily gain from birth to 6 months (ADG2), average daily gain from 3 months to 6 months (ADG3), average daily gain from 3 months to 9 months (ADG4), average daily gain from 3 months to ye...

LIMIT AVERAGE SHADOWING AND DOMINATED SPLITTING

In this paper the notion of limit average shadowing property is introduced for diffeomorphisms on a compact smooth manifold M and a class of diffeomorphisms is given which has the limit average shadowing property, but does not have the shadowing property. Moreover, we prove that for a closed f-invariant set Lambda of a diffeomorphism f, if Lambda is C1-stably limit average shadowing and t...

Publication date: 1999